Improving PP Attachment Disambiguation in a Rule-based Parser

Authors

  • Yoon-Hyung Roh
  • Ki-Young Lee
  • Young-Gil Kim
Abstract

This paper deals with how to enhance the performance of a rule-based parser using statistical information. PP (prepositional phrase) attachment ambiguity is one of the main ambiguities found in parsing. We therefore conducted experiments on extracting statistical information for PP attachment from a corpus and on applying that information to a rule-based parser. Two types of information are used: supervised learning data and unsupervised learning data. In this paper, we show how we apply these types of information and to what degree they contribute to PP attachment as well as to overall parsing performance. The final results show a 5.42% performance improvement in PP attachment, with an 8.7% error reduction ratio in the overall parsing performance.
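
The abstract reports an "error reduction ratio" without defining it. A minimal sketch of the usual relative error reduction calculation follows, assuming the ratio compares error rates before and after the change. The function name and the sample accuracies are hypothetical and are not taken from the paper.

```python
def error_reduction_ratio(baseline_accuracy: float, improved_accuracy: float) -> float:
    """Relative error reduction between two accuracy figures, each in [0, 1].

    Assumed definition (not confirmed by the paper):
        (baseline_error - improved_error) / baseline_error
    """
    baseline_error = 1.0 - baseline_accuracy
    improved_error = 1.0 - improved_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical accuracies chosen only to illustrate the arithmetic.
ratio = error_reduction_ratio(0.80, 0.82)
print(f"relative error reduction: {ratio:.1%}")
```

Under this definition, a small absolute gain in accuracy can correspond to a much larger relative reduction in errors when the baseline is already high, which is why the two figures in the abstract are not directly comparable.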

Similar resources

The Benefit of Stochastic PP Attachment to a Rule-Based Parser

To study PP attachment disambiguation as a benchmark for empirical methods in natural language processing it has often been reduced to a binary decision problem (between verb or noun attachment) in a particular syntactic configuration. A parser, however, must solve the more general task of deciding between more than two alternatives in many different contexts. We combine the attachment predicti...

Improving Parsing and PP Attachment Performance with Sense Information

To date, parsers have made limited use of semantic information, but there is evidence to suggest that semantic features can enhance parse disambiguation. This paper shows that semantic classes help to obtain significant improvement in both parsing and PP attachment tasks. We devise a gold-standard sense- and parse tree-annotated dataset based on the intersection of the Penn Treebank and SemCor, a...

Using Bilingual Chinese-English Word Alignments to Resolve PP-Attachment Ambiguity in English

Errors in English parse trees impact the quality of syntax-based MT systems trained using those parses. Frequent sources of error for English parsers include PP-attachment ambiguity, NP-bracketing ambiguity, and coordination ambiguity. Not all ambiguities are preserved across languages. We examine a common type of ambiguity in English that is not preserved in Chinese: given a sequence “VP NP PP...

Generalised PP-attachment Disambiguation Using Corpus-based Linguistic Diagnostics

We propose a new formulation of the PP attachment problem as a 4-way classification which takes into account the argument or adjunct status of the PP. Based on linguistic diagnostics, we train a 4-way classifier that reaches an average accuracy of 73.9% (baseline 66.2%). Compared to a sequence of binary classifiers, the 4-way classifier reaches better performance and individuates a verb's argum...

Acquisition et évaluation sur corpus de propriétés de sous-catégorisation syntaxique

We carry out an experiment aimed at using subcategorization information in a syntactic parser for PP attachment disambiguation. The subcategorization lexicon consists of probabilities between a word (verb, noun, adjective) and a preposition. The lexicon is acquired automatically from a 200-million-word corpus that is partially tagged and parsed. In order to assess the lexicon, we use four di...

Publication date: 2011